Nearest Neighbor Classifier



Statistical Guarantees of Distributed Nearest Neighbor Classification

Neural Information Processing Systems

Nearest neighbor is a popular nonparametric method for classification and regression with many appealing properties. In the big data era, the sheer volume and spatial/temporal disparity of big data may prohibit centrally processing and storing the data. This imposes a considerable hurdle for nearest neighbor prediction, since the entire training data must be memorized. One effective way to overcome this issue is the distributed learning framework. Through majority voting, the distributed nearest neighbor classifier achieves the same rate of convergence as its oracle version in terms of the regret, up to a multiplicative constant that depends solely on the data dimension. This multiplicative difference can be eliminated by replacing majority voting with a weighted voting scheme. In addition, we provide sharp theoretical upper bounds on the number of subsamples under which the distributed nearest neighbor classifier still attains the optimal convergence rate. It is interesting to note that the weighted voting scheme allows a larger number of subsamples than majority voting. Our findings are supported by numerical studies.
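As a concrete illustration of the distributed scheme described above, the following Python sketch (not the authors' code; it relies on scikit-learn's KNeighborsClassifier, and the function names are invented for this example) splits the training data across machines, fits a local kNN classifier on each subsample, and combines predictions either by a majority vote over hard labels or by averaging the local posterior estimates, a simple form of weighted voting.

```python
# Minimal sketch of distributed nearest neighbor classification with binary
# labels in {0, 1}. Not the authors' implementation.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def fit_distributed_knn(X, y, n_machines=10, k=5, seed=0):
    # Randomly partition the data into n_machines disjoint subsamples and
    # fit one local kNN classifier per subsample.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    return [
        KNeighborsClassifier(n_neighbors=k).fit(X[part], y[part])
        for part in np.array_split(idx, n_machines)
    ]

def predict_majority(models, X_new):
    # Each machine casts a hard 0/1 vote; the majority label wins.
    votes = np.stack([m.predict(X_new) for m in models])
    return (votes.mean(axis=0) > 0.5).astype(int)

def predict_weighted(models, X_new):
    # Each machine contributes its local posterior estimate P(Y=1 | x);
    # the averaged estimate is thresholded at 1/2.
    probs = np.stack([m.predict_proba(X_new)[:, 1] for m in models])
    return (probs.mean(axis=0) > 0.5).astype(int)
```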


On Convergence of Nearest Neighbor Classifiers over Feature Transformations

Neural Information Processing Systems

The k-Nearest Neighbors (kNN) classifier is a fundamental non-parametric machine learning algorithm. However, it is well known that it suffers from the curse of dimensionality, which is why in practice one often applies a kNN classifier on top of a (pre-trained) feature transformation. From a theoretical perspective, most, if not all, theoretical results aimed at understanding the kNN classifier are derived for the raw feature space. This leads to an emerging gap between our theoretical understanding of kNN and its practical applications. In this paper, we take a first step towards bridging this gap.
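The practical setup the paper analyzes can be summarized in a few lines. The sketch below is only illustrative: PCA stands in for whatever pre-trained transformation one would actually use, and the function name is invented for this example.

```python
# Illustrative sketch: run kNN on top of a learned feature transformation
# rather than on the raw inputs. PCA is used here as a stand-in for phi.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

def knn_over_transformation(X_train, y_train, X_test, k=5, n_components=32):
    phi = PCA(n_components=n_components).fit(X_train)   # learned transformation
    Z_train, Z_test = phi.transform(X_train), phi.transform(X_test)
    clf = KNeighborsClassifier(n_neighbors=k).fit(Z_train, y_train)
    return clf.predict(Z_test)                           # kNN in feature space
```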


Conformal Safety Monitoring for Flight Testing: A Case Study in Data-Driven Safety Learning

Feldman, Aaron O., Harp, D. Isaiah, Duncan, Joseph, Schwager, Mac

arXiv.org Artificial Intelligence

We develop a data-driven approach for runtime safety monitoring in flight testing, where pilots perform maneuvers on aircraft with uncertain parameters. Because safety violations can arise unexpectedly as a result of these uncertainties, pilots need clear, preemptive criteria to abort the maneuver in advance of a safety violation. To solve this problem, we use offline stochastic trajectory simulation to learn a calibrated statistical model of the short-term safety risk facing pilots. We use flight testing as a motivating example for data-driven learning/monitoring of safety due to its inherent safety risk, uncertainty, and human interaction. However, our approach consists of three broadly applicable components: a model to predict future state from recent observations, a nearest neighbor model to classify the safety of the predicted state, and classifier calibration via conformal prediction. We evaluate our method on a flight dynamics model with uncertain parameters, demonstrating its ability to reliably identify unsafe scenarios, match theoretical guarantees, and outperform baseline approaches in preemptive classification of risk.
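The last two components map naturally onto a short split-conformal sketch. The code below is a schematic illustration under several assumptions (binary labels with 0 = safe and 1 = unsafe, a kNN probability estimate as the risk score, invented function names), not the authors' pipeline: calibration scores from known-unsafe states fix a threshold so that a truly unsafe state is flagged with probability at least 1 − alpha.

```python
# Schematic split-conformal calibration of a nearest neighbor safety monitor.
# Assumes a fitted KNeighborsClassifier `knn` with classes {0: safe, 1: unsafe}.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def calibrate_threshold(knn, X_cal, y_cal, alpha=0.05):
    # Scores: the model's estimated probability of "safe" (class 0), evaluated
    # on calibration points that are actually unsafe (y == 1).
    p_safe = knn.predict_proba(X_cal)[:, 0]
    scores = np.sort(p_safe[y_cal == 1])
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))   # conformal rank
    # If k exceeds n, the guarantee forces the monitor to always flag.
    return scores[k - 1] if k <= n else np.inf

def monitor(knn, x, threshold):
    # Flag (abort) whenever the estimated probability of "safe" is no larger
    # than the calibrated threshold.
    return knn.predict_proba(x.reshape(1, -1))[0, 0] <= threshold
```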



On a Theory of Nonparametric Pairwise Similarity for Clustering: Connecting Clustering to Classification

Yingzhen Yang, Feng Liang, Shuicheng Yan, Zhangyang Wang, Thomas S. Huang

Neural Information Processing Systems

The success of pairwise clustering largely depends on the pairwise similarity function defined over the data points, where kernel similarity is broadly used. In this paper, we present a novel pairwise clustering framework by bridging the gap between clustering and multi-class classification. This pairwise clustering framework learns an unsupervised nonparametric classifier from each data partition, and searches for the optimal partition of the data by minimizing the generalization error of the learned classifiers associated with the data partitions. We consider two nonparametric classifiers in this framework, i.e., the nearest neighbor classifier and the plug-in classifier. Modeling the underlying data distribution by nonparametric kernel density estimation, the generalization error bounds for both unsupervised nonparametric classifiers are sums of nonparametric pairwise similarity terms between the data points, which serve the purpose of clustering. Under the uniform distribution, the nonparametric similarity terms induced by both unsupervised classifiers exhibit a well-known form of kernel similarity. We also prove that the generalization error bound for the unsupervised plug-in classifier is asymptotically equal to the weighted volume of the cluster boundary [1] for Low Density Separation, a widely used criterion for semi-supervised learning and clustering. Based on the derived nonparametric pairwise similarity using the plug-in classifier, we propose a new nonparametric exemplar-based clustering method with enhanced discriminative capability, whose superiority is evidenced by the experimental results.
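As a rough, illustrative stand-in for this framework (not the paper's derived similarity or bounds), one can score each candidate partition by the cross-validated error of a nearest neighbor classifier trained on that partition's labels and keep the partition with the smallest estimated error; the sketch below assumes scikit-learn and invented function names.

```python
# Illustrative surrogate: pick the partition whose induced classification
# problem has the lowest estimated generalization error under a 1-NN classifier.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def best_partition(X, candidate_labelings, k=1, cv=5):
    # Each labeling assigns a cluster id to every point; clusters are assumed
    # large enough to support cv-fold cross-validation.
    best, best_err = None, np.inf
    for labels in candidate_labelings:
        clf = KNeighborsClassifier(n_neighbors=k)
        acc = cross_val_score(clf, X, labels, cv=cv).mean()
        err = 1.0 - acc                      # estimated generalization error
        if err < best_err:
            best, best_err = labels, err
    return best, best_err
```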


Review for NeurIPS paper: On Convergence of Nearest Neighbor Classifiers over Feature Transformations

Neural Information Processing Systems

Summary and Contributions: Update: Thanks for addressing the concerns raised by the reviewers. Based on re-reading the paper and going over the comments, I am able to understand the experiments better, and given the authors' statement that they will revise the draft to make things clearer, I will change my score to accept. Having said that, I would still keep my confidence low, since I am unable to accurately assess the significance of the result, and I believe that would be a key factor to consider for a novel theoretical paper. The result is based on two key properties of the transformed space that the authors identify. The first is 'safety', which is a measure of how well the posterior in the original space can be recovered from the feature space. The second is smoothness, which is a measure of how hard it is to recover the posterior in the original space from the feature space.


Review for NeurIPS paper: On Convergence of Nearest Neighbor Classifiers over Feature Transformations

Neural Information Processing Systems

This paper provides some interesting theoretical insights into the convergence of kNN over feature transformations, backed up by empirical results. All three reviewers argue for acceptance but have also provided some directions for improvement, which the authors acknowledged in their feedback, promising to include these changes in the final version. Personally, I have one issue with the paper: it introduces some datasets in the experimental section without providing any results for them. These are supplied only in the additional material, which to me feels like cheating.

